Indexing the Pickup and Drop-Off Locations of NYC Taxi Trips in PostgreSQL - Lessons from the Road
Abstract
In this paper, we present our experience in indexing the drop-off and pick-up locations of taxi trips in New York City. The paper presents a comprehensive experimental analysis of classic and state-of-the-art spatial database indexing schemes. It evaluates a popular spatial tree indexing scheme (i.e., GIST-Spatial), a Block Range Index (BRIN-Spatial) provided by PostgreSQL, and a new indexing scheme, namely Hippo-Spatial. In the experiments, the paper considers five evaluation metrics to compare and contrast the performance of the three indexing schemes: storage overhead, index initialization time, query response time, maintenance overhead, and throughput. Furthermore, the benchmark takes into account parameters that affect index performance, which include, but are not limited to, data size, spatial query selectivity, and spatial area density. The paper finally analyzes the experimental evaluation results and highlights the key insights and lessons learned. The results emphasize that there is no one-size-fits-all solution when it comes to indexing massive-scale spatial data. The results also prove that modern database systems can maintain a lightweight index (in terms of storage and maintenance overhead) that is also fast enough for spatial data analytics applications. The source code for the experiments presented in the paper is available here: https://github.com/DataSystemsLab/hippo-postgresql
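To make the idea of a lightweight block range index more concrete, the following is a minimal Python sketch, not the paper's implementation and not PostgreSQL's code. It assumes rows are stored in insertion order and that each fixed-size range of rows is summarized by one bounding box; all names here (BlockSummary, build_block_index, range_query, rows_per_range) are hypothetical and chosen only for illustration. A rectangle query skips every range whose box does not overlap the query and rechecks the rows in the remaining ranges, since the summary is lossy.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]               # (longitude, latitude)
Box = Tuple[float, float, float, float]   # (min_x, min_y, max_x, max_y)


@dataclass
class BlockSummary:
    start: int   # index of the first row in the range
    end: int     # one past the last row in the range
    bbox: Box    # bounding box of all points in the range


def build_block_index(points: List[Point], rows_per_range: int) -> List[BlockSummary]:
    """Summarize each consecutive run of rows by a single bounding box."""
    summaries = []
    for start in range(0, len(points), rows_per_range):
        chunk = points[start:start + rows_per_range]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        bbox = (min(xs), min(ys), max(xs), max(ys))
        summaries.append(BlockSummary(start, start + len(chunk), bbox))
    return summaries


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(box: Box, p: Point) -> bool:
    """True if the point lies inside the box (borders included)."""
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


def range_query(points: List[Point], index: List[BlockSummary], query: Box):
    """Return (matching row numbers, number of rows actually inspected)."""
    hits = []
    scanned = 0
    for summary in index:
        if not boxes_overlap(summary.bbox, query):
            continue  # the whole range is skipped without reading its rows
        for row in range(summary.start, summary.end):
            scanned += 1
            if contains(query, points[row]):  # recheck: the summary is lossy
                hits.append(row)
    return hits, scanned
```

The index holds one small summary per range rather than one entry per row, which is where the low storage and maintenance overhead of this family of indexes comes from. The trade-off is that pruning works well only when nearby rows on disk are also nearby in space.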
Similar resources
Analyzing Urban Human Mobility Patterns through a Thematic Model at a Finer Scale
Taxi trajectories reflect human mobility over a road network. Pick-up and drop-off locations in different time periods represent origins and destinations of trips, respectively, demonstrating the spatiotemporal characteristics of human behavior. Each trip can be viewed as a displacement in the random walk model, and the distribution of extracted trips shows a distance decay effect. To identify ...
Exploring Spatiotemporal Patterns of Long-Distance Taxi Rides in Shanghai
Floating Car Data (FCD) has been analyzed for various purposes in past years. However, limited research about the behaviors of taking long-distance taxi rides has been made available. In this paper, we used data from over 12,000 taxis during a six-month period in Shanghai to analyze the spatiotemporal patterns of long-distance taxi trips. We investigated these spatiotemporal patterns by compari...
Optimizing Cruising Routes for Taxi Drivers Using a Spatio-Temporal Trajectory Model
Much of the taxi route-planning literature has focused on driver strategies for finding passengers and determining the hot spot pick-up locations using historical global positioning system (GPS) trajectories of taxis based on driver experience, distance from the passenger drop-off location to the next passenger pick-up location and the waiting times at recommended locations for the next passeng...
Spatial Indexing of Large-Scale Geo-Referenced Point Data on GPGPUs Using Parallel Primitives
Modern positioning and locating technologies, e.g., GPS, have generated huge amounts of geo-referenced point data that are crucial to understand environmental and social-economic phenomena. Unfortunately, traditional disk-resident databases are inefficient in handling large-scale point data. In this study, we propose to utilize the massive data parallel processing power of General Purpose compu...
Exploring Traffic Dynamics in Urban Environments Using Vector-Valued Functions
The traffic infrastructure greatly impacts the quality of life in urban environments. To optimize this infrastructure, engineers and decision makers need to explore traffic data. In doing so, they face two important challenges: the sparseness of speed sensors that cover only a limited number of road segments, and the complexity of traffic patterns they need to analyze. In this paper we take a f...
Publication date: 2017